1  Technical  Abstract


In this final technical report for the Phase I Air Force STTR contract FA9550-08-C-0044, the technical objectives, work accomplished, results, and technical feasibility are summarized. The first and primary objective
of  this  research  is  to  systematically  study  the  role  of  noise  in  human  decision  making.  We  achieve  this 
objective by showing that stochastic resonance (SR)-like behavior arises in Wald’s sequential probability
ratio  test  (SPRT)  model  when  the  actual  input  signal  is  significantly  weaker  than  anticipated  by  the  model. 
We  derive  expressions  for  calculating  the  fraction  of  correct  responses,  the  mean  decision  time,  and  the 
reward  rate  for  the  SPRT/DDD  (discrete  drift  diffusion)  model  of  decision  making.  We  then  demonstrate 
that  both  the  fraction  of  correct  responses  and  the  reward  rate  have  a  peak  as  a  function  of  noise  strength 
while  the  mean  decision  time  is  a  monotonically  increasing  function  of  the  noise  strength.  We  also  examine 
the  dependence  of  our  results  on  the  initial  condition  and  the  form  of  the  input  probability  distributions. 
Finally,  to  gain  analytical  insights,  we  consider  the  continuous  time  limit  of  the  SPRT/DDD  model.  We 
show  that  the  closed-form  expressions  from  the  resulting  continuous  drift  diffusion  (CDD)  model  help  us 
understand the SR-type behavior in the SPRT/DDD model, but there are also important differences between
the discrete and continuous time cases. Overall, an appropriate amount of noise can improve the
decision-making process when the input signal is weak.

The  second  objective  of  the  project  is  to  study  adaptive  SPRT.  That  is,  we  study  the  role  of  prior 
distribution  in  an  adaptive  SPRT  algorithm.  One  simple  adaptive  SPRT  algorithm  without  prior  distribution 
is  first  presented.  The  simple  algorithm  is  based  on  the  estimation  of  the  real  distributions.  The  obtained 
result  shows  that  the  algorithm  makes  many  quick  and  wrong  decisions  and  sacrifices  accuracy.  Then,  we 
use  a  wide  prior  distribution  to  improve  the  robustness  of  the  algorithm.  The  new  algorithm  combines  the 
prior  distribution  and  the  information  from  the  input  data  to  estimate  the  real  distributions.  The  simulation 
result  shows  improvement  in  terms  of  accuracy  while  the  response  time  remains  moderate.  We  also  plot 
accuracy  and  reward  rate  against  the  weight  of  the  prior  probability.  It  is  observed  that  there  exists  an 
optimal  non-zero  and  finite  weight  to  achieve  the  best  accuracy  when  the  real  distributions  are  wider  than 
the  prior  distribution.  Finally,  we  examine  the  evolving  performance  of  this  adaptive  SPRT  algorithm  with 
prior  distribution  when  feedback  is  provided. 

A  third  but  secondary  objective  of  the  project  is  to  exploit  the  role  of  dynamic  systems  and  control 
theories  in  decision  making.  We  propose  an  extension  of  Busemeyer’s  decision  field  theory  (DFT)  by  means 
of  the  contraction  mapping  principle.  It  is  believed  that  such  an  extended  DFT  will  assist  operators  in 
making better decisions in supervisory control situations where manual control and automation coexist.


2  Introduction:  Team  Members  and  Phase  I  Publication  Summary 
2.1  Project  Members 

Principal Investigator: Zhong-Ping Jiang, Professor and IEEE Fellow

Department  of  Electrical  and  Computer  Engineering,  Polytechnic  Institute  of  New  York  University,  Brooklyn. 

Dr.  Jiang  is  in  charge  of  all  aspects  of  this  project.  His  expertise  is  in  nonlinear  control  theory,  nonlinear 
dynamics,  optimization  and  their  applications. 

Project  Manager:  Xiaoming  Zhuang,  President 
Applied  SR  Technologies  Inc.,  Brooklyn,  NY 

President  Zhuang  is  in  charge  of  the  coordination  of  this  project. 

Academic  Researcher:  Xingxing  Wu,  Ph.D  and  part-time  researcher 

Department  of  Electrical  and  Computer  Engineering,  Polytechnic  Institute  of  NYU,  Brooklyn,  NY 

Under the guidance of Professor Jiang, Xingxing plays an active role in the research efforts on the theory
of stochastic resonance and its applications in signal and image processing. He also helps graduate students with
SR-related issues.


Visiting  Scholar:  Jerome  Busemeyer,  Professor 
Psychology  Department,  Indiana  University 

Dr.  Busemeyer  provides  general  comments  on  the  research  related  to  his  DFT  and  how  dynamic  system  (and 
control)  theory  is  useful  in  developing  computational  and  mathematical  models  for  human  decision  making. 

Consultant:  Ning  Qian,  Associate  Professor 

Department  of  Neuroscience,  Columbia  University,  1051  Riverside  Dr.,  New  York. 

Dr. Qian plays a role in the research using his expertise in neuroscience and detection theory, in particular
his prior work on the development of models for stochastic-resonance-type behavior in sensory perception.

Research  Students:  Xiao  Han,  Feng  Ma,  Shiyun  Xu,  Juan  Zhang 

These graduate students help us with various tasks such as computer simulations using Matlab and C++,
extensive literature review, and preparation of technical reports.

2.2  Publications 

Our Phase I research has made significant progress and has led to the following publications:


•  S. Xu, Z. P. Jiang, L. Huang and Daniel W. Repperger, “Control-oriented approaches in dynamic
decision making,” Proceedings of the 8th WSEAS Int. Conference on Robotics, Control and Manufacturing
Technology, April 2008, Hangzhou, pp. 138-146.

•  X.  Han,  Z.P.  Jiang,  D.  Repperger  and  N.  Qian,  Stochastic-resonance-like  behavior  in  Wald’s  sequential 
probability  ratio  test  model  for  decision  making,  to  be  submitted  to  Neural  Computation. 

•  X.  Han,  Z.P.  Jiang  and  N.  Qian,  The  role  of  prior  distribution  in  an  adaptive  SPRT  algorithm.  Under 
preparation. 

3  Technical  Objectives 

The  primary  phase  I  research  and  development  objectives  are  as  follows: 

1.  Investigation  of  stochastic  resonance  (SR)  behavior  in  the  sequential  probability  ratio  test  (SPRT) 
model  for  decision  making.  Stochastic  resonance  can  enhance  signal  detection  by  improving  certain 
performance  measures  such  as  signal-to-noise  ratio,  reward  rate  and  fraction  of  correct  responses.  One 
of  the  project  tasks  is  to  demonstrate  the  stochastic  resonance  phenomenon  in  the  SPRT  model  by 
taking  advantage  of  noise.  As  is  well-known,  the  SPRT  model  and  the  equivalent  discrete  drift  diffusion 
(DDD)  model  have  been  widely  used  to  explain  human  and  animal  decision  making  in  psychophysical 
tasks.  These  models  assume  that  observers  gradually  accumulate  evidence  from  noisy  inputs  and  make 
a  decision  when  the  evidence  reaches  a  threshold.  In  this  report,  we  show  that  stochastic-resonance 
type  behavior  arises  in  the  SPRT  model  when  the  actual  input  signal  is  significantly  weaker  than 
anticipated  by  the  model.  Specifically,  we  derive  expressions  for  calculating  the  fraction  of  correct 
responses,  the  mean  decision  time,  and  the  reward  rate  for  the  SPRT  model  of  decision  making.  We 
then  demonstrate  that  both  the  fraction  of  correct  responses  and  the  reward  rate  have  a  peak  as  a 
function  of  noise  strength  while  the  mean  decision  time  is  a  monotonically  increasing  function  of  noise 
strength. 

2. Differences between the predictions of the CDD model and the DDD model in dynamic decision
making. The choice of model for decision making depends on the type of question we want
to answer. We propose to develop mathematically rigorous analysis to illustrate the key differences
between  the  predictions  of  continuous-  and  discrete-time  diffusion  models  known  as  CDD  and  DDD. 
We  have  made  some  interesting  observations  which  are  consistent  with  MATLAB-based  computer 
simulations.  First,  if  the  SPRT/DDD  model  selects  the  thresholds  according  to  the  equations  from 
the  CDD  model,  then  CR  (the  mathematical  expectation  of  accuracy)  would  be  better  than  desired, 
but  at  the  price  of  longer  response  time  than  the  anticipation  of  the  CDD  model.  Second,  the  response 
time  always  exhibits  a  peak  which  is  not  predicted  by  the  CDD  model.  As  a  result,  if  the  actual  noise 
strength  is  to  the  right  of  the  peak,  then  a  moderate  amount  of  extra  noise  could  possibly  reduce 
the  response  time  without  violating  the  constraints  on  the  Type  I  and  Type  II  errors,  due  to  the 
first  difference.  Third,  the  tail  of  the  reward  rate  curve  drops  significantly  with  stronger  noise,  while 
the  CDD  model  predicts  constant  reward  rate.  Such  a  result  may  be  important  for  psychological 
experiments. 

3.  Effects  of  non-zero  initial  condition  and  extensions  to  other  forms  of  distributions.  For  the  purpose  of 
testing  the  robustness  of  the  SR-like  behavior,  we  look  at  the  CR  behavior  when  the  initial  diffusion 
position  is  biased  (because  of  asymmetric  prior  probability)  or  asymmetric  constraints  on  Type  I  and 
Type  II  errors.  The  simulation  results  show  that  the  SR-like  behavior  gets  more  evident  as  initial  bias 
moves  toward  the  upper  threshold  and  weaker  if  the  initial  bias  is  negative.  Numerical  studies  are 
also  run  on  other  forms  of  distributions  such  as  Gamma  distribution.  For  the  Gamma  distribution 
with  wrongly  assumed  shape  parameter  K,  it  is  observed  that  SR-like  behavior  also  occurs.  Unlike 
the  Gaussian  case,  the  peak  gets  higher  and  narrower  with  larger  Z  but  the  position  of  the  peak  does 
not  seem  to  move  much  with  Z.  Like  the  Gaussian  case,  the  response  time  also  exhibits  a  peak  and 
gets  longer  almost  linearly  with  increasing  Z  and  the  reward  rate  also  exhibits  a  peak  and  gets  smaller 
with  increasing  Z. 

4.  Adaptive  SPRT.  We  study  the  role  of  prior  distribution  in  dynamic  decision  making  via  an  adaptive 
SPRT  algorithm.  The  ultimate  goal  is  to  use  a  wide  prior  distribution  to  improve  the  robustness  of 
the  algorithm.  The  key  strategy  is  to  develop  a  new  algorithm  that  combines  the  prior  distribution 
and  the  information  from  the  input  data  to  estimate  the  real  distributions.  In  addition,  we  examine 
the  evolving  performance  of  this  new  adaptive  SPRT  algorithm  with  prior  distribution  when  feedback 
is  provided. 

5. Applications of control and dynamic system theories to the development of mathematical and
computational models for human dynamic decision making. It is widely recognized that feedback and
control play a crucial role in human decision making. We have obtained some control-theoretic results
on dynamic decision making using the celebrated contraction mapping principle. It is shown that an
extension of Busemeyer’s Decision Field Theory (DFT), named Generalized DFT, can be proposed
to solve supervisory control problems involving manual control and automation. An application to
the benchmark example of the sugar factory task yields promising results. The obtained results have
been presented at the 8th WSEAS International Conference ROCOM (April 2008, pp. 138-146, ISBN
978-960-6766-51-0), and an expanded version has been selected for publication in a special issue of a
Springer journal.

4  Technical  Approach  and  Accomplished  Work 
4.1  Background 

Stochastic resonance (SR) occurs when the output signal-to-noise ratio in a nonlinear system is maximized by
a moderate level of noise. The term is also used more generally to describe any phenomenon in which noise plays a
positive role. SR was first observed in the physics literature (Benzi et al. [1]) and then widely studied
in other fields of science and engineering. Gammaitoni et al. give an in-depth review of SR [2].
SR-like behavior has also been well documented in psychophysical experiments. For example, Collins and
his co-workers [11, 12] found that human detection of a weak tactile stimulus can be enhanced by adding a
certain amount of noise. Gong et al. [13] have proposed a theory that explains Collins et al.’s data very well.

An  integral  component  of  any  psychophysical  experiment  is  decision  making.  In  the  fixed-time  (or 
interrogation)  paradigm  used  in  many  experiments,  including  those  of  Collins  et  al.  [11,  12],  a  stimulus  is 
presented  for  a  fixed  duration  in  each  trial,  and  an  observer  has  to  decide,  for  example,  whether  the  stimulus  is 
noise  or  signal  plus  noise.  However,  a  different  approach,  called  the  reaction-time  (or  free-response)  paradigm, 
is  often  employed  when  one  wants  to  study  the  dynamical  decision  making  process  more  explicitly.  In  this 
paradigm,  the  stimulus  presentation  in  each  trial  is  terminated  by  the  observer  when  he  or  she  feels  ready  to 
make  a  decision.  Typically,  observers  earn  points  for  correct  decisions  and  lose  points  for  wrong  ones.  Since 
a longer time is needed to make more accurate decisions, observers have to compromise between speed and
accuracy to maximize the total points per unit time (reward rate).

Wald’s  SPRT  model  and  the  equivalent  DDD  model  [3,  4]  provide  a  natural  framework  for  understanding 
decision  making  in  the  reaction-time  paradigm.  These  models  assume  that  observers  gradually  accumulate 
evidence from noisy inputs and make a decision when the evidence reaches a threshold. In most psychophysical
experiments, observers have to decide between two alternatives (e.g., presence or absence of a signal,
leftward  or  rightward  motion).  This  maps  naturally  to  the  upper  and  lower  thresholds  in  the  models.  The 
models  choose  one  or  the  other  alternative  depending  on  whether  the  accumulated  evidence  reaches  the 
upper  or  the  lower  threshold.  The  models  have  been  applied  to  a  wide  range  of  psychophysical  experiments; 
see  [8]  for  a  recent  review. 

While  there  are  both  experimental  and  theoretical  studies  of  the  SR-like  behavior  in  the  fixed-time 
paradigm  [11,  12,  13],  to  our  knowledge  no  SR  studies  have  been  done  for  the  reaction-time  paradigm  and 
the associated models for decision making. The focus of this paper is to demonstrate SR-like behavior in the
SPRT/DDD model, and in its continuous time limit, the CDD model. Previous applications of these models
[7]  often  assume  that  the  decision-making  process  has  perfect  knowledge  of  the  probability  distributions 
from  which  inputs  are  drawn.  This  is,  however,  unlikely  to  be  true.  Instead,  the  brain  must  rely  on  its  prior 
experience  over  a  long  period  of  time  to  estimate  the  probability  distributions  of  the  inputs.  Consequently, 
the  estimated  distributions  must  be  different  from  the  actual  ones  arbitrarily  picked  by  experimenters  for  a 
given  psychophysical  experiment.  In  particular,  when  the  input  signals  are  very  weak,  the  decision  process 
must  incorrectly  assume  that  the  signals  are  drawn  from  distributions  containing  much  stronger  signals.  One 
of  our  research  goals  is  to  show  that  this  is  precisely  the  situation  where  the  SR-like  behavior  emerges. 


4.2  Stochastic  Resonance  Behavior  in  Wald’s  Sequential  Probability  Ratio  Test 
(SPRT) 

4.2.1  SPRT  and  Its  Drift  Diffusion  Models 


Take as an example the two-alternative forced choice (2AFC) task in the reaction-time paradigm. Wald’s
SPRT model for decision making is based on the probability distributions of the input $x$ under hypotheses
$H_0$ and $H_1$:

$$p(x|H_n), \qquad n = 0, 1$$

For each new sampled input $x_k$, the model accumulates evidence by updating the running product of
likelihood ratios:

$$L(k_0) = \prod_{k=1}^{k_0} \frac{p(x_k|H_1)}{p(x_k|H_0)}$$

and compares the product with a lower threshold $\frac{1-p(H_1)}{p(H_1)} \cdot \frac{1-P}{1-\alpha}$ and an upper threshold $\frac{1-p(H_1)}{p(H_1)} \cdot \frac{P}{\alpha}$, where
$p(H_1)$ is the prior probability of $H_1$, $P$ is the power of the test and $\alpha$ is the significance level of the test. The
thresholds [4] are determined so that the probability of a Type II error ($H_1$ is correct but the model chooses
$H_0$) is $1 - P$ and the probability of a Type I error ($H_0$ is correct but the model chooses $H_1$) is $\alpha$. If the
product is smaller than the lower threshold, the model decides that $H_0$ is correct (this decision is called $D_0$).
If the product is larger than the upper threshold, then the model decides that $H_1$ is correct (this decision is
called $D_1$). If neither condition is satisfied, the model continues to sample the input and update the product
until one of the thresholds is reached. The SPRT model is optimal (Wald [3, 4]) in the sense that for fixed
constraints on Type I and Type II errors, no other test allows a shorter decision time on average.
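
The accumulate-and-compare loop described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used for the simulations in this report: it assumes Gaussian likelihoods with a common variance, equal priors and symmetric error constraints (the assumptions introduced later in Subsection 4.2.2), and it works with the logarithm of the product. The function name `sprt_decide` and its parameters are hypothetical.

```python
import math

def sprt_decide(samples, mu0_hat, mu1_hat, sigma_hat, power=0.95):
    """Symmetric Gaussian SPRT sketch: return (decision, steps_used).

    decision is 1 (choose H1), 0 (choose H0), or None if the samples ran out
    before either threshold was crossed.
    """
    z = math.log(power / (1.0 - power))  # log-thresholds sit at -z and +z
    y = 0.0                              # log of the running likelihood-ratio product
    steps = 0
    for x in samples:
        steps += 1
        # log of p(x|H1)/p(x|H0) for two Gaussians sharing the variance sigma_hat**2
        y += (2.0 * x * (mu1_hat - mu0_hat) + mu0_hat**2 - mu1_hat**2) / (2.0 * sigma_hat**2)
        if y > z:
            return 1, steps
        if y < -z:
            return 0, steps
    return None, steps
```

With assumed means 0 and 1, unit assumed variance and a constant input equal to 1, each sample adds 0.5 to the log-likelihood ratio, so the sketch stops at the sixth sample.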

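By taking the logarithm, we can convert the product of likelihood ratios into a sum of log-likelihood ratios

$$Y(k_0) = \log L(k_0) = \sum_{k=1}^{k_0} \log \frac{p(x_k|H_1)}{p(x_k|H_0)} = \sum_{k=1}^{k_0} s_k \tag{1}$$

which is referred to as the discrete drift diffusion (DDD) model, with the new diffusion boundaries $Z_0$ and
$Z_1$:

$$Z_0 = \log\left(\frac{1-p(H_1)}{p(H_1)} \cdot \frac{1-P}{1-\alpha}\right), \qquad Z_1 = \log\left(\frac{1-p(H_1)}{p(H_1)} \cdot \frac{P}{\alpha}\right) \tag{2}$$

Note that $s_k = \log \frac{p(x_k|H_1)}{p(x_k|H_0)}$ is the $k$th diffusion step, while $Y(k_0)$ is the diffusion position after $k_0$ steps.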

On the other hand, when the diffusion steps are infinitesimally small, the SPRT/DDD model converges
to the CDD model, which is described by the following stochastic differential equation

$$dy = A\,dt + c\,dW, \qquad y(0) = y_0 \tag{3}$$

where $A$ is the drift per unit time, $W$ is a standard Wiener process (zero mean and unit variance per unit time), $c$ is
the standard deviation of the noise term, and $y(0)$ is the initial bias. We will first consider the case of
zero initial bias and then discuss the case of nonzero initial conditions.
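
Equation (3) can be explored numerically with a simple Euler-Maruyama scheme. The sketch below is an illustration under stated assumptions rather than the simulation code of this project: the step size `dt`, the symmetric boundaries at $\pm Z$, and the function names are all hypothetical choices. For zero initial bias, the standard first-passage result for this process gives an upper-boundary probability of $1/(1+e^{-2AZ/c^2})$, which the simulation should approach as `dt` shrinks.

```python
import math
import random

def simulate_cdd(drift, noise, z, dt=0.01, y0=0.0, rng=random, max_steps=1_000_000):
    """Integrate dy = A dt + c dW until |y| reaches z; return (hit_upper, time)."""
    y = y0
    sqrt_dt = math.sqrt(dt)
    for step in range(1, max_steps + 1):
        # One Euler-Maruyama step: deterministic drift plus scaled Gaussian increment
        y += drift * dt + noise * sqrt_dt * rng.gauss(0.0, 1.0)
        if y >= z:
            return True, step * dt
        if y <= -z:
            return False, step * dt
    raise RuntimeError("no boundary reached within max_steps")

def upper_hit_fraction(drift, noise, z, trials=2000, seed=0):
    """Monte Carlo estimate of the probability of reaching +z before -z."""
    rng = random.Random(seed)
    hits = sum(simulate_cdd(drift, noise, z, rng=rng)[0] for _ in range(trials))
    return hits / trials
```

Because the discretised path can overshoot a boundary, the estimate for finite `dt` is slightly biased relative to the continuous-time value, which is one small-scale view of the discrete-versus-continuous differences discussed in this report.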

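The CDD parameters $A$ and $c$ correspond to the mathematical expectation and standard deviation of a
single diffusion step $s$ in the DDD model. To be specific, we use $A_n$ to denote the drift rate under hypothesis
$H_n$:

$$A_n = \frac{(2\mu_n - \hat{\mu}_0 - \hat{\mu}_1)(\hat{\mu}_1 - \hat{\mu}_0)}{2\hat{\sigma}^2}, \qquad c = \frac{\sigma(\hat{\mu}_1 - \hat{\mu}_0)}{\hat{\sigma}^2}, \qquad n = 0, 1 \tag{4}$$

Here $\mu_n$ and $\sigma$ are the actual mean and standard deviation of the input, and $\hat{\mu}_n$ and $\hat{\sigma}$ are the values assumed by the model, as specified in Subsection 4.2.2.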


4.2.2  Some  Technical  Assumptions 

DC  signal  and  signal-independent  Gaussian  noise:  We  assume  that  the  signal  is  DC  and  the  noise  is 
Gaussian  and  signal-independent.  This  is  justified  when  the  two  hypotheses  are  symmetric  such  as  leftward 
motion  vs.  rightward  motion.  It  can  also  be  justified  for  asymmetric  hypotheses  such  as  “noise  vs.  signal 
plus  noise”  as  long  as  the  signal  is  very  weak. 

To be specific, the actual distributions from which the inputs are drawn under $H_0$ and $H_1$ are given by:

$$p(x|H_0) \sim \mathcal{N}(\mu_0, \sigma^2)$$
$$p(x|H_1) \sim \mathcal{N}(\mu_1, \sigma^2)$$

where $\mathcal{N}(\mu, \sigma^2)$ denotes the Gaussian distribution with mean $\mu$ and variance $\sigma^2$.

SPRT parameters inaccurate and non-adaptive: Since the brain does not have direct access to these
distributions and has to rely on prior experience to estimate them, we assume that the
distributions used in the model for calculating the likelihood ratios are different:

$$p(x|H_0) \sim \mathcal{N}(\hat{\mu}_0, \hat{\sigma}^2)$$
$$p(x|H_1) \sim \mathcal{N}(\hat{\mu}_1, \hat{\sigma}^2)$$

Throughout this paper, we will analyze the non-adaptive SPRT model. The SPRT parameters are assumed
by the SPRT model prior to the decision-making process and do not adapt to the input data.

Symmetry: We further assume that the two hypotheses are equally probable, that is,

$$p(H_0) = p(H_1) = 0.5$$

and the constraints on Type I and Type II errors are symmetric, that is,

$$P + \alpha = 1$$
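
Following these assumptions, the boundaries (2) for the diffusion process are symmetric with respect to zero:

$$Z_0 = \log\frac{1-P}{P}, \qquad Z_1 = \log\frac{P}{1-P}$$

We define the distance from the boundaries to zero to be $Z$, i.e. $Z = Z_1 = -Z_0$. As a result, we have

$$P = \frac{e^{Z}}{1+e^{Z}} \quad \text{and} \quad \alpha = \frac{1}{1+e^{Z}} \tag{5}$$

where $Z > 0$.

Under these assumptions, a single diffusion step in the DDD model is:

$$s_k = \log\frac{p(x_k|H_1)}{p(x_k|H_0)} = \log\frac{\frac{1}{\sqrt{2\pi\hat{\sigma}^2}}\exp\left(-\frac{(x_k-\hat{\mu}_1)^2}{2\hat{\sigma}^2}\right)}{\frac{1}{\sqrt{2\pi\hat{\sigma}^2}}\exp\left(-\frac{(x_k-\hat{\mu}_0)^2}{2\hat{\sigma}^2}\right)} = \frac{2x_k(\hat{\mu}_1-\hat{\mu}_0) + \hat{\mu}_0^2 - \hat{\mu}_1^2}{2\hat{\sigma}^2} \tag{6}$$

Since the input $x_k$ is Gaussian distributed and $s_k$ is linearly related to $x_k$, $s_k$ is also Gaussian distributed.
The probability distributions of $s_k$ under $H_0$ and $H_1$ are given by:

$$p(s_k|H_0) \sim \mathcal{N}\left(\frac{(2\mu_0-\hat{\mu}_0-\hat{\mu}_1)(\hat{\mu}_1-\hat{\mu}_0)}{2\hat{\sigma}^2},\ \frac{\sigma^2(\hat{\mu}_1-\hat{\mu}_0)^2}{\hat{\sigma}^4}\right)$$

$$p(s_k|H_1) \sim \mathcal{N}\left(\frac{(2\mu_1-\hat{\mu}_0-\hat{\mu}_1)(\hat{\mu}_1-\hat{\mu}_0)}{2\hat{\sigma}^2},\ \frac{\sigma^2(\hat{\mu}_1-\hat{\mu}_0)^2}{\hat{\sigma}^4}\right) \tag{7}$$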
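
As a quick consistency check on the relations among (4), (5), (6) and (7), the following Python sketch evaluates the step for a given input and compares it with the stated mean and standard deviation. The function names are hypothetical and the numbers in the usage note are illustrative only. Because the step is linear in the input, its mean under $H_n$ equals the step evaluated at $x = \mu_n$, and shifting the input by one actual standard deviation $\sigma$ shifts the step by $c$.

```python
import math

def boundary(power):
    """Z from the desired power P under the symmetry assumption (inverse of (5))."""
    return math.log(power / (1.0 - power))

def power_from_boundary(z):
    """P = e^Z / (1 + e^Z), equation (5)."""
    return math.exp(z) / (1.0 + math.exp(z))

def step(x, mu0_hat, mu1_hat, sigma_hat):
    """Single diffusion step s_k for input x, equation (6)."""
    return (2.0 * x * (mu1_hat - mu0_hat) + mu0_hat**2 - mu1_hat**2) / (2.0 * sigma_hat**2)

def step_mean(mu_n, mu0_hat, mu1_hat, sigma_hat):
    """Mean of s_k when the actual input mean is mu_n (drift A_n in (4))."""
    return (2.0 * mu_n - mu0_hat - mu1_hat) * (mu1_hat - mu0_hat) / (2.0 * sigma_hat**2)

def step_std(sigma, mu0_hat, mu1_hat, sigma_hat):
    """Standard deviation of s_k when the actual input std is sigma (c in (4))."""
    return sigma * (mu1_hat - mu0_hat) / sigma_hat**2
```

For example, with assumed means 0 and 10, assumed standard deviation 2 and an actual mean of 1, the drift is negative even though the signal is present, which is the mismatch regime studied in this report.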


4.2.3  Expressions  for  the  performance  measures  of  the  SPRT/DDD  model 

To demonstrate the SR-like behavior in the SPRT/DDD model, we need to derive the expressions for the
fraction of correct responses, the mean decision time, and the reward rate of the SPRT/DDD model:


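$$CR = p(H_0) \cdot p(D_0|H_0) + p(H_1) \cdot p(D_1|H_1), \tag{8}$$

$$\langle DT \rangle = p(H_0) \cdot \langle DT_{H_0} \rangle + p(H_1) \cdot \langle DT_{H_1} \rangle, \tag{9}$$

$$RR = \frac{CR - q \cdot ER}{\langle DT \rangle + T_0} \tag{10}$$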

According to equations (8), (9) and (10), we only need to derive the expressions for the first passage probability
$p(D_1|H_n)$ and the mean decision time $\langle DT_{H_n} \rangle$.

By decomposition, we have

$$p(D_1|H_n) = \sum_{k_0=1}^{\infty} p(D_1, DT = k_0|H_n), \qquad n = 0, 1$$

$$\langle DT_{H_n} \rangle = \sum_{k_0=1}^{\infty} k_0 \cdot p(DT = k_0|H_n), \qquad n = 0, 1$$

and

$$p(DT = k_0|H_n) = p(D_1, DT = k_0|H_n) + p(D_0, DT = k_0|H_n)$$

Thus, to determine all the performance measures, we only need to calculate

$$p(D_1, DT = k_0|H_n) \quad \text{and} \quad p(D_0, DT = k_0|H_n), \qquad n = 0, 1$$

i.e. the probability of passing the upper boundary at time $k_0$ and the probability of passing the lower
boundary at time $k_0$.

We will use Figure 1 to illustrate the derivation. We will also use $g_{H_n}(y)$ to denote the PDF of a single
diffusion step, where $n = 0, 1$.

Figure 1: Functions $f_{k_0,H_n}(y)$, with $Z = 3$.


Apparently, the PDF of $Y(1)$ is simply the PDF of a single diffusion step, $g_{H_n}(y)$. We define $f_{1,H_n}(y)$ as
$g_{H_n}(y)$. As a result, the probability of passing the upper boundary at time $k_0 = 1$ is area 3 of $f_{1,H_n}(y)$, i.e. the area above $Z$.

If the first step of diffusion reached neither boundary ($-Z$ or $Z$), then with this further information,
the conditional PDF of $Y(1)$ (we will denote this condition as $DT > 1$) is part 2 of $f_{1,H_n}(y)$, i.e. the part between $-Z$ and $Z$, multiplied
by a normalization factor.

We will denote part 2 of $f_{1,H_n}(y)$ as $T(f_{1,H_n}(y))$, where $T(\cdot)$ is a truncation operator which keeps
the part of a function between $-Z$ and $Z$ and sets it to zero elsewhere.

By definition, $Y(k_0) = Y(k_0 - 1) + s_{k_0}$, where $Y(k_0 - 1)$ and $s_{k_0}$ are independent random variables; thus
the PDF of $Y(2)$ is the convolution of $T(f_{1,H_n}(y))$ and $g_{H_n}(y)$ multiplied by a normalization factor.

If we define $f_{2,H_n}(y)$ as $T(f_{1,H_n}(y)) * g_{H_n}(y)$, where $*$ denotes convolution, then the PDF of $Y(2)$ is

$$f_{2,H_n}(y) \Big/ \int_{-Z}^{Z} f_{1,H_n}(y)\,dy.$$

Generally, we will define

$$f_{k_0+1,H_n}(y) = T(f_{k_0,H_n}(y)) * g_{H_n}(y), \qquad k_0 \ge 1 \tag{11}$$


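Proposition 4.1: $p(DT \ge k_0|H_n) = \int_{-\infty}^{\infty} f_{k_0,H_n}(y)\,dy$.

Proof: Apparently, $p(DT \ge 1|H_n) = 1 = \int_{-\infty}^{\infty} f_{1,H_n}(y)\,dy$. Generally, if $p(DT \ge k_0|H_n) = \int_{-\infty}^{\infty} f_{k_0,H_n}(y)\,dy$ for a certain $k_0$, then $p(DT \ge k_0 + 1|H_n)$ is simply area 2 of $f_{k_0,H_n}(y)$. To be explicit,

$$p(DT \ge k_0 + 1|H_n) = \int_{-Z}^{Z} f_{k_0,H_n}(y)\,dy = \int_{-\infty}^{\infty} f_{k_0+1,H_n}(y)\,dy$$

where the second equality holds because convolution with the PDF $g_{H_n}(y)$ in (11) preserves the total area.

Proposition 4.2: The conditional PDF of $Y(k_0)$ (with condition $DT \ge k_0$) is $f_{k_0,H_n}(y) \big/ \int_{-\infty}^{\infty} f_{k_0,H_n}(y)\,dy$.

Proof: The conditional PDF of $Y(1)$ (with condition $DT \ge 1$) is apparently

$$f_{1,H_n}(y) \Big/ \int_{-\infty}^{\infty} f_{1,H_n}(y)\,dy = g_{H_n}(y)$$

Generally, if the conditional PDF of $Y(k_0)$ is $f_{k_0,H_n}(y) \big/ \int_{-\infty}^{\infty} f_{k_0,H_n}(y)\,dy$, and the $k_0$th step of diffusion did
not reach either boundary ($-Z$ or $Z$), then the condition changes from $DT \ge k_0$ to $DT \ge k_0 + 1$,
and the conditional PDF of $Y(k_0 + 1)$ is

$$C_{k_0} \cdot T\left(\frac{f_{k_0,H_n}(y)}{\int_{-\infty}^{\infty} f_{k_0,H_n}(y)\,dy}\right) * g_{H_n}(y) = \left(\frac{C_{k_0}}{\int_{-\infty}^{\infty} f_{k_0,H_n}(y)\,dy}\right) \cdot f_{k_0+1,H_n}(y) = N_{k_0+1} \cdot f_{k_0+1,H_n}(y) \tag{12}$$

where $C_{k_0}$ and $N_{k_0+1}$ are normalization factors. The normalization factor $N_{k_0+1}$ is apparently $1 \big/ \int_{-\infty}^{\infty} f_{k_0+1,H_n}(y)\,dy$
to ensure that the total area of a PDF is 1. Thus the conditional PDF of $Y(k_0 + 1)$ is $f_{k_0+1,H_n}(y) \big/ \int_{-\infty}^{\infty} f_{k_0+1,H_n}(y)\,dy$.

Proposition 4.3: The probability of passing the upper boundary at time $k_0$ is $\int_{Z}^{\infty} f_{k_0,H_n}(y)\,dy$.

Proof: In other words, this is the probability that the first $k_0 - 1$ steps reached neither boundary and
the $k_0$th step crosses the upper boundary, to be specific, “$DT \ge k_0$ and $Y(k_0) > Z$”.

$$p(D_1, DT = k_0|H_n) = p(Y(k_0) > Z \,|\, DT \ge k_0, H_n) \cdot p(DT \ge k_0|H_n) = \frac{\int_{Z}^{\infty} f_{k_0,H_n}(y)\,dy}{\int_{-\infty}^{\infty} f_{k_0,H_n}(y)\,dy} \cdot \int_{-\infty}^{\infty} f_{k_0,H_n}(y)\,dy = \int_{Z}^{\infty} f_{k_0,H_n}(y)\,dy \tag{13}$$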
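
The truncate-and-convolve recursion (11), together with the first-passage formula of Proposition 4.3, can be evaluated numerically on a grid. The sketch below is one possible discretisation, assuming a Gaussian step PDF and a midpoint rule on the interval between the boundaries; it is not the MATLAB code used for the results in this report, and the function names and the grid size are hypothetical. Only the values of $f_{k_0,H_n}$ between $-Z$ and $Z$ are stored, and the mass that lands above $Z$ or below $-Z$ in the next step is obtained from the Gaussian CDF.

```python
import math

def norm_pdf(x, mean, std):
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))

def norm_cdf(x, mean, std):
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def first_passage(step_mean, step_std, z, cells=121, tol=1e-10, max_steps=5000):
    """Return (p_upper, p_lower, mean_decision_time) for boundaries at -z and +z."""
    h = 2.0 * z / cells
    ys = [-z + (i + 0.5) * h for i in range(cells)]    # cell midpoints inside (-z, z)
    f = [norm_pdf(y, step_mean, step_std) for y in ys]  # T(f_1) sampled on the grid
    # Probability that one further step from position u lands above z / below -z
    up = [1.0 - norm_cdf(z - u, step_mean, step_std) for u in ys]
    lo = [norm_cdf(-z - u, step_mean, step_std) for u in ys]
    # kernel[i][j] approximates g(y_i - u_j) du, the discretised convolution in (11)
    kernel = [[norm_pdf(y - u, step_mean, step_std) * h for u in ys] for y in ys]

    p_up = 1.0 - norm_cdf(z, step_mean, step_std)       # mass of f_1 above z
    p_lo = norm_cdf(-z, step_mean, step_std)            # mass of f_1 below -z
    mean_dt = p_up + p_lo                               # contribution of k0 = 1
    for k in range(2, max_steps + 1):
        up_k = h * sum(fj * w for fj, w in zip(f, up))  # mass of f_k above z
        lo_k = h * sum(fj * w for fj, w in zip(f, lo))  # mass of f_k below -z
        p_up += up_k
        p_lo += lo_k
        mean_dt += k * (up_k + lo_k)
        f = [sum(kij * fj for kij, fj in zip(row, f)) for row in kernel]  # T(f_k)
        if h * sum(f) < tol:                            # surviving mass is negligible
            break
    return p_up, p_lo, mean_dt
```

In the matched case (assumed and actual distributions identical) with a step mean of 0.5, a step standard deviation of 1 and $P = 0.95$, Wald's inequalities imply that the upper-boundary probability under $H_1$ is at least 0.95, which gives a simple sanity check for the discretisation.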


4.2.4  Condition  for  the  SR-like  behavior  in  the  CDD  model 

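The CDD model predicts that, under the above-mentioned assumptions,

$$p(D_0|H_0) = \frac{1}{1 + e^{2A_0Z/c^2}}, \qquad p(D_1|H_1) = \frac{1}{1 + e^{-2A_1Z/c^2}}$$

$$\langle DT_{H_0} \rangle = \frac{Z}{A_0}\tanh\left(\frac{A_0Z}{c^2}\right), \qquad \langle DT_{H_1} \rangle = \frac{Z}{A_1}\tanh\left(\frac{A_1Z}{c^2}\right)$$

As a result (according to (8) and (9)),

$$CR = \frac{1}{2} \cdot \frac{1}{1 + e^{2A_0Z/c^2}} + \frac{1}{2} \cdot \frac{1}{1 + e^{-2A_1Z/c^2}}, \qquad \langle DT \rangle = \frac{Z}{2A_0}\tanh\left(\frac{A_0Z}{c^2}\right) + \frac{Z}{2A_1}\tanh\left(\frac{A_1Z}{c^2}\right)$$

Under the assumptions in Subsection 4.2.2, we propose that the necessary and sufficient condition for the
SR-like behavior is

$$A_0 \cdot A_1 > 0 \quad \text{and} \quad A_0 < A_1 \tag{17}$$

Specifically, CR, as a function of $\sigma$, has a global maximum at a non-zero and finite $\sigma$.

The above condition is equivalent to (according to (4)),

$$\mu_0 < \mu_1 < \frac{\hat{\mu}_0 + \hat{\mu}_1}{2}, \quad \text{or} \quad \frac{\hat{\mu}_0 + \hat{\mu}_1}{2} < \mu_0 < \mu_1 \tag{18}$$

i.e. the signal under $H_1$ is much weaker than anticipated or the signal under $H_0$ is much stronger than
anticipated.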
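
The condition above can be illustrated by scanning the CDD prediction for CR over the actual noise level. The sketch below assumes the standard closed-form first-passage probabilities of a drift diffusion process with symmetric boundaries and zero initial bias, combined with the drift and noise parameters in (4) and equal priors; the function names and the numerical values in the usage note are hypothetical and chosen only to satisfy the first inequality in (18).

```python
import math

def logistic(t):
    """Numerically stable 1 / (1 + exp(-t))."""
    if t >= 0.0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def cdd_correct_rate(sigma, mu0, mu1, mu0_hat, mu1_hat, sigma_hat, power=0.95):
    """CR predicted by the CDD model for actual noise std sigma and equal priors."""
    z = math.log(power / (1.0 - power))
    d = mu1_hat - mu0_hat
    c2 = (sigma * d / sigma_hat**2) ** 2                      # c^2 from (4)
    a0 = (2.0 * mu0 - mu0_hat - mu1_hat) * d / (2.0 * sigma_hat**2)
    a1 = (2.0 * mu1 - mu0_hat - mu1_hat) * d / (2.0 * sigma_hat**2)
    p_d0_h0 = logistic(-2.0 * a0 * z / c2)                    # lower boundary under H0
    p_d1_h1 = logistic(2.0 * a1 * z / c2)                     # upper boundary under H1
    return 0.5 * p_d0_h0 + 0.5 * p_d1_h1

def scan_noise(sigmas, **model):
    """Return (sigma, CR) at the grid point with the largest predicted CR."""
    rates = [cdd_correct_rate(s, **model) for s in sigmas]
    best = max(range(len(sigmas)), key=rates.__getitem__)
    return sigmas[best], rates[best]
```

With assumed means 0 and 10, assumed standard deviation 1, and actual means 0 and 1 (a signal much weaker than anticipated), the predicted CR stays near 0.5 for very small and very large $\sigma$ and peaks at an intermediate noise level, in line with the SR-like behavior described above.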

4.3  Adaptive  SPRT


The  presence  of  SR-like  behavior  in  Wald’s  SPRT  model  can  be  viewed  as  an  indication  of  the  mismatch 
between  the  actual  distribution  and  the  assumed  input  distribution  that  initiates  the  adaptive  process. 
Indeed,  it  is  possible  that  the  decision-making  process  in  the  brain  gradually  adapts  its  internally  assumed 
input  distributions  to  match  the  actual  distributions  if  feedback  is  provided.  There  has  been  a  previous 
study  on  adaptive  SPRT  [19].  However,  that  study  does  not  assume  any  prior  probability  distributions  and 
attempts  to  estimate  the  distributions  from  the  available  input  data  alone.  Therefore,  the  method  may  not 
be  robust  when  the  input  data  are  few. 

In order to study the role of prior distribution in adaptive SPRT, we proceed with two cases: adaptive
SPRT without prior distribution and adaptive SPRT with prior distribution. First, let us assume that the input
distributions under $H_0$ and $H_1$ are $\mathcal{N}(\mu_0, \sigma_0^2)$ and $\mathcal{N}(\mu_1, \sigma_1^2)$, respectively. We want the SPRT algorithm to decide
between $H_0$ and $H_1$, with the objective of minimizing the response time while achieving a desired accuracy
described by power $P$ and significance level $\alpha$.

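Adaptive SPRT algorithm without prior distribution: First pass. In this case, the algorithm finds the unbiased
MMSE estimates of the parameters $\mu_0$, $\sigma_0^2$, $\mu_1$, $\sigma_1^2$ using the following formulas:

$$\hat{\mu} = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad \hat{\sigma}^2 = \frac{1}{N-1}\sum_{i=1}^{N} (x_i - \hat{\mu})^2$$

It turns out that

$$\hat{\mu} \sim \mathcal{N}\left(\mu, \frac{\sigma^2}{N}\right), \qquad \hat{\sigma}^2 \sim \Gamma\left(\frac{N-1}{2}, \frac{2\sigma^2}{N-1}\right)$$

with $E(\hat{\mu}) = \mu$, $Var(\hat{\mu}) = \sigma^2/N$, $E(\hat{\sigma}^2) = \sigma^2$, $Var(\hat{\sigma}^2) = 2\sigma^4/(N-1)$.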

For each step of diffusion, the $n$th step, for example, the model estimates the model parameters based
on all the input samples except the current sample, i.e. $x_1, x_2, \ldots, x_{n-1}$. The diffusion size is then computed
using the new model parameters $\hat{\mu}_0$, $\hat{\sigma}_0^2$, $\hat{\mu}_1$, $\hat{\sigma}_1^2$.

Note that if we only make one decision, then we do not know whether the input samples are drawn from
$p(x|H_0)$ or $p(x|H_1)$. As a result, we can only adapt the common parameters of $p(x|H_0)$ and $p(x|H_1)$. For
example, if $\sigma_0 = \sigma_1 = \sigma$, then we can estimate $\sigma$ using the input samples without knowing whether they are
drawn from $p(x|H_0)$ or $p(x|H_1)$.

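Next, we will study a simple case where $\mu_0 = 0$ and $\mu_1 = \mu$ are known and we estimate only $\sigma = \sigma_0 = \sigma_1$.
In this case, the unbiased MMSE estimate becomes:

$$\hat{\sigma}^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2$$

where $\hat{\sigma}^2$ follows a (scaled) chi-square distribution and the variance is $2\sigma^4/N$.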

The performance of this simple case is plotted in Figure 2 and Figure 3. The desired accuracy is 0.95.
This simple adaptive SPRT is more robust to noise variance but achieves only about 0.83 accuracy. As
anticipated, the response time appears quadratically dependent on $\sigma$.

Adaptive SPRT algorithm with prior distribution: Second pass. Suppose that we want to estimate a random
variable $a$ using observations from sensor 1 and sensor 2. The two observations $a_1$ and $a_2$ have variances $\sigma_1^2$
and $\sigma_2^2$, respectively. It turns out that the MMSE estimate of $a$ is

$$\hat{a} = \frac{C_1 \cdot a_1 + C_2 \cdot a_2}{C_1 + C_2}$$

where $C_1 = 1/\sigma_1^2$ and $C_2 = 1/\sigma_2^2$.
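
The two-sensor combination rule can be written as a short function. This is a minimal sketch of the inverse-variance weighting stated above; the function name `fuse` and the example values are hypothetical.

```python
def fuse(a1, var1, a2, var2):
    """Inverse-variance weighted estimate from two observations of the same quantity."""
    c1 = 1.0 / var1  # weight C_1 = 1 / sigma_1^2
    c2 = 1.0 / var2  # weight C_2 = 1 / sigma_2^2
    return (c1 * a1 + c2 * a2) / (c1 + c2)
```

When the two variances are equal the result is the plain average; when one observation is noisier, the estimate moves toward the more reliable one. For instance, observations 0 and 10 with variances 1 and 4 give an estimate of 2.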

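We will use $\mathcal{N}(\mu_{0p}, \sigma_{0p}^2)$ and $\mathcal{N}(\mu_{1p}, \sigma_{1p}^2)$ to denote the prior distributions. The estimates from the
input data are $\mathcal{N}(\mu_{0d}, \sigma_{0d}^2)$ and $\mathcal{N}(\mu_{1d}, \sigma_{1d}^2)$.

To combine $\mathcal{N}(\mu_p, \sigma_p^2)$ and $\mathcal{N}(\mu_d, \sigma_d^2)$ into $\mathcal{N}(\hat{\mu}, \hat{\sigma}^2)$, we model the prior distribution $\mathcal{N}(\mu_p, \sigma_p^2)$ as the
observation from sensor 1 and $\mathcal{N}(\mu_d, \sigma_d^2)$ as the observation from sensor 2.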

Figure 2: CR versus sigma, no prior distribution, power = 0.95, significance = 0.05. $\mu = 6$,
$\sigma_p = 20$, $\sigma \in \{10, 12, \ldots, 30\}$. Averaged over 10000 trials.

Figure 3: CR versus sigma, no prior distribution, power = 0.95, significance = 0.05. $\mu = 6$,
$\sigma_p = 20$, $\sigma \in \{10, 12, \ldots, 30\}$. Averaged over 10000 trials.

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We  assume  that  the  prior  distribution  is  estimated  from  M  “virtual  inputs” .  The  virtual  inputs  are 
drawn  from  but  “unfortunately”  have  mean  fip  and  variance  Cp. 

When $\mu_0$ and $\mu_1$ are unknown, using the unbiased MMSE estimation, the formulas for combination are

\[ \hat{\mu} = \frac{M \cdot \mu_p + N \cdot \mu_d}{M + N}, \qquad \hat{\sigma}^2 = \frac{(M-1)\sigma_p^2 + (N-1)\sigma_d^2}{M + N - 2}, \tag{19} \]

where $M$ is the number of "virtual inputs" and $N$ is the number of "real inputs".

When $\mu_0$ and $\mu_1$ are known, using the unbiased MMSE estimation, the formula for combination is:

\[ \hat{\sigma}^2 = \frac{M \sigma_p^2 + N \sigma_d^2}{M + N}. \tag{20} \]

Intuitively, the prior distribution dominates if $M > N$, and the estimate from real input data dominates if $M < N$.
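The combination rules (19) and (20) can be sketched as follows. This is a minimal illustration. The function names are hypothetical, the formulas are written as read from the equations above (prior and data estimates weighted by $M$ and $N$), and the numbers are chosen only to show how the weight shifts from the prior to the data as $N$ grows.

```python
def combine_unknown_means(mu_p, var_p, m, mu_d, var_d, n):
    """Rule (19): pool prior and data estimates when the means are unknown."""
    if m + n <= 2:
        raise ValueError("need M + N > 2 for the pooled variance")
    mu_hat = (m * mu_p + n * mu_d) / (m + n)
    var_hat = ((m - 1) * var_p + (n - 1) * var_d) / (m + n - 2)
    return mu_hat, var_hat


def combine_known_means(var_p, m, var_d, n):
    """Rule (20): pool the variance estimates when the means are known."""
    return (m * var_p + n * var_d) / (m + n)


# Prior worth M = 10 virtual inputs with sigma_p = 20; data variance 10^2.
for n_real in (1, 10, 1000):
    print(n_real, combine_known_means(20.0**2, 10, 10.0**2, n_real))
```

With only one real input the pooled variance stays near the prior value of 400, while with 1000 real inputs it approaches the data value of 100.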

The performance of this adaptive SPRT algorithm with prior distribution (which also adapts only $\sigma$) is plotted in Figure 4 and Figure 5. The desired accuracy is 0.95. This adaptive SPRT with prior distribution is less robust to noise variance but achieves about 0.95 accuracy. The response time remains moderate compared with the simple adaptive SPRT algorithm without prior distribution.


Figure 4: CR versus sigma, with prior distribution, M = 10, power = 0.95, significance = 0.05. $\mu = 6$, $\sigma_p = 20$, $\sigma \in \{10, 12, \ldots, 30\}$. Averaged over 10000 trials.

Figure 5: CR versus sigma, with prior distribution, M = 10, power = 0.95, significance = 0.05. $\mu = 6$, $\sigma_p = 20$, $\sigma \in \{10, 12, \ldots, 30\}$. Averaged over 10000 trials.


4.4 Control-Theoretic Approach to Dynamic Decision Making

Dynamic decision making tasks include important activities such as stock trading, air traffic control, and managing continuous production processes. In these tasks, decision makers make multiple recurring decisions to reach a target, and they receive feedback on the outcome of their efforts along the way (see [45, 52, 54] and the references therein).

Insights transferred from related domains make it possible to formulate learning by applying control theory to the study of human performance in dynamic decision making [53]. Brehmer uses control theory [37, 38] as a framework to analyze goal-directed behavior in dynamic decision-making environments. People who use less sophisticated environment models are able to improve their performance only when feedback is timely and continuous [37, 38]. Jordan and Rumelhart [54, 56] address similar issues in the area of motor learning. A key idea of their approach to dynamic decision making is to divide the learning problem into two interdependent subproblems. A broad set of topics, including feedback control, feedforward control, delay, and learning algorithms, is then introduced into this area [55]. Gibson [51] builds on Jordan's connectionist network and applies online learning in a parallel distributed processing (neural network) control model to illustrate the Sugar Factory (SF) task [50].

The SF model is a simple dynamic decision-making task in which decision makers are expected to learn from experience [35, 50, 44]. It is of interest to computational organization theorists, and various tests have been conducted on it. A typical phenomenon in these experiments is that, while participants progressively improve their capacity to control the system, they remain unable to describe how the system works or how it reaches the target value, which leads to large amounts of repetitive work and low efficiency. Against this background, an automatic design is required and is presented here as a reference.

Automation can improve the efficiency and safety of complex and dangerous operating environments by reducing the physical or mental burden on human operators [64]. Nevertheless, whether or not automation is engaged remains a critical distinction, and the operator's role changes from that of a controller directly involved with the system to that of a supervisory controller [58]. In such supervisory control systems, operators monitor the performance of automation during normal operations and intervene to take manual control when necessary.

Studies have shown that operators' use of automation reflects automation reliability, and that inappropriate reliance, associated with misuse and disuse, partly depends on how well trust matches the true capabilities of the automation [68]. Design, evaluation, and training that enhance human-automation partnerships, together with high specificity of trust, are required to mitigate misuse and disuse of automation [59]. Consequently, better operator knowledge of how automation works and of the automation design philosophy are both required for more appropriate use of automation [63].

The operator's choice plays such an important role in the performance of the automated system that the allocation of functions is becoming a critical decision-making process, and optimizing this process is of great importance [60]. A dynamic approach that capitalizes on the power of Decision Field Theory (DFT) has been developed to characterize operators' reliance on automation in a supervisory control system by describing a quantitative model of trust in automation, and an Extended DFT (EDFT) model has been proposed [49]. As trust and self-confidence are closely associated with the capacity of automation and of manual control, respectively [57, 69], it behooves us to improve the existing model in order to help the operator gain a better understanding of these capacities.

Our research is targeted at developing a framework to modify the EDFT method based on the celebrated contraction mapping principle. The result is supported by computational simulations on the benchmark example of the Sugar Factory supervisory control scenario.

Due to the complexity and variability of automation performance, the operator's choice between automatic and manual control in supervisory control situations can be considered both a preferential choice problem and a decision-making process described by Decision Field Theory (DFT) [41, 42]. The standard elementary DFT model used to investigate decision making under risk or uncertainty can be described through a straightforward example in supervisory control. Suppose one is facing the problem of choosing whether to rely on automation (A) or to intervene with manual control (M), as shown in the following chart. In Figure 6, $S_1$ and $S_2$ are two interdependent uncertain events, one of which may occur at a certain time point. $S_1$ denotes the occurrence of an automation fault and $S_2$ represents the incidence of a fault that compromises manual control. During the course of decision making, the valence of an action $V_i$ ($i = A$ or $M$), defined as the subjective expected payoff for that action, fluctuates from sample to sample and depends on the subjective probability weight $W(S_j)$ and the utility of the payoff [40].

Figure 6: DFT choosing model in a supervisory control situation [49]

The preference state at sample $n$ is derived based on the accumulated valence difference:

\[
\begin{aligned}
P(n) &= (1 - s) \times P(n-1) + [V_A(n) - V_M(n)] \\
     &= (1 - s) \times P(n-1) + [d + \varepsilon(n)].
\end{aligned} \tag{21}
\]
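As a minimal sketch of recursion (21), the following Python snippet iterates the preference state for a constant mean valence difference $d$. The function name, the Gaussian form of the residual $\varepsilon(n)$, and the parameter values are assumptions made for illustration; the report does not specify them at this point.

```python
import random


def simulate_preference(d, s, noise_std, n_steps, p0=0.0, seed=0):
    """Iterate P(n) = (1 - s) * P(n - 1) + d + eps(n), eps ~ N(0, noise_std^2)."""
    rng = random.Random(seed)
    preference = p0
    trajectory = [preference]
    for _ in range(n_steps):
        eps = rng.gauss(0.0, noise_std)  # assumed Gaussian residual
        preference = (1.0 - s) * preference + d + eps
        trajectory.append(preference)
    return trajectory


# Without noise the preference settles at d / s, provided 0 < s < 1.
noise_free = simulate_preference(d=0.5, s=0.1, noise_std=0.0, n_steps=300)
noisy = simulate_preference(d=0.5, s=0.1, noise_std=1.0, n_steps=300)
print(noise_free[-1], noisy[-1])
```

In the noise-free case the fixed point $d/s$ follows from setting $P(n) = P(n-1)$ in (21); with noise the trajectory fluctuates around that level.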


Let $C$ represent the true capability of the automation ($C_A$) or manual control ($C_M$). The former describes the reliability of the automation in terms of fault occurrence and general ability to accomplish the task under normal conditions, while the latter describes how well the operator can manually control the system in various situations. $B_C$ denotes the belief in, or estimate of, the capability of automation ($B_{CA}$) or the operator's manual capability ($B_{CM}$). In the EDFT model, sequential decision processes are linked by dynamically updating beliefs regarding the capability of automation or manual control based on previous decisions in order to guide the next decision as follows [49]:

\[ B_C(n) = B_C(n-1) + \frac{1}{b_1}\bigl(C(n-1) - B_C(n-1)\bigr). \tag{22} \]

The value $b_1$ ($b_1 \ge 1$) represents the level of transparency of the system interface, describing how well information is conveyed to the operator when capability information is available. $b_1 = 1$ means the information is perfectly conveyed to the operator. The larger $b_1$ is, the more poorly information is conveyed to the operator.

According to the formulation in [48], beliefs represent the information base that determines attitudes; attitudes in turn determine intentions and, consequently, behaviors. In supervisory control, trust and self-confidence are both attitudes that depend on beliefs, while at the same time they determine preference and reliance. Let $T$ and $SC$ denote trust and self-confidence, which are updated with $B_{CA}$ and $B_{CM}$ as the new inputs, respectively. The preference of A over M is defined as the difference between trust and self-confidence at time step $n$ in the EDFT model, denoted by $P(n)$ [49]:

\[
\begin{aligned}
P(n) &= T(n) - SC(n) \\
     &= [(1 - s) \times T(n-1) + s \times B_{CA}(n) + \varepsilon(n)] - [(1 - s) \times SC(n-1) + s \times B_{CM}(n) + \varepsilon(n)] \\
     &= (1 - s) \times P(n-1) + s \times [B_{CA}(n) - B_{CM}(n)] + \varepsilon_P(n).
\end{aligned} \tag{23}
\]

Here the difference between $C_A$ and $C_M$ corresponds to $d$, and $P(n)$, combined with other factors such as time constraints, will determine whether to actually rely on automation or not.

In a supervisory control system, operators are sensitive to their ability to predict the capacity of automation or manual control, and previous findings suggest that operators' trust is closely linked with the capacity of automation [61]. More specifically, people's trust in automation may vary according to the change in the discrepancy between the operators' expectation and the true behavior (the capacity) of automation. Consequently, though it is useful to know the influence of capacity $C$ on trust, it is necessary to examine whether the expectation of capacity is close to the practical situation if we are to develop a predictive model of trust in automation and intervention behavior. Improving the accuracy of operators' perception of the system capacities will also greatly enhance the appropriateness of their trust in automation. Based on this, it is necessary to develop a modified model that can better reflect appropriate trust.

One of the effective ways to modify the EDFT model is to consider the discrepancy between the capacity at two sequential time steps. Accordingly, belief is expressed as:

\[
B_C(n) = B_C(n-1) + \frac{1}{b_1}\bigl(C(n-1) - B_C(n-1)\bigr) + \Bigl(1 - \frac{1}{b_1}\Bigr)\bigl(C(n-1) - C(n-2)\bigr). \tag{24}
\]

By rearranging (24) and shifting the index, we get:

\[
\lambda(n-1) = \Bigl(1 - \frac{1}{b_1}\Bigr)\lambda(n-2), \tag{25}
\]

where $\lambda(n-1) = B_C(n-1) - C(n-2)$. Because $0 \le 1 - 1/b_1 < 1$ for $b_1 \ge 1$, equation (25) constitutes a contraction mapping, from whose definition we know that $\lambda(n-1)$ converges to 0 for sufficiently large $n$. Consequently, $B_C(n-1)$ will eventually converge to $C(n-2)$ as the time step $n$ increases.

Modifying the generation of belief according to (24) enables operators to form beliefs much closer to the true capacity, and it provides a better understanding of how automation works. As a result, the operators' trust in automation will grow, leading to more appropriate reliance on automation. The effectiveness is supported by simulation results.
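The difference between the original belief update (22) and the modified update (24) can be sketched numerically. The snippet below is an illustration under assumptions: the function name, the linearly drifting capability sequence, and the value $b_1 = 2$ are hypothetical, and both rules are written as read from the equations above.

```python
def track_beliefs(capability, b1, initial_belief=0.0):
    """Run the original and the modified belief update on a capability sequence.

    original: B(n) = B(n-1) + (C(n-1) - B(n-1)) / b1
    modified: B(n) = B(n-1) + (C(n-1) - B(n-1)) / b1
                     + (1 - 1/b1) * (C(n-1) - C(n-2))
    Returns two lists holding the errors B(n) - C(n-1) for n = 2, 3, ...
    """
    gain = 1.0 / b1
    original = modified = initial_belief
    original_errors, modified_errors = [], []
    for n in range(2, len(capability) + 1):
        c_prev, c_prev2 = capability[n - 1], capability[n - 2]
        original += gain * (c_prev - original)
        modified += gain * (c_prev - modified) + (1.0 - gain) * (c_prev - c_prev2)
        original_errors.append(original - c_prev)
        modified_errors.append(modified - c_prev)
    return original_errors, modified_errors


# Hypothetical capability that drifts upward by 0.01 per step.
capability = [0.5 + 0.01 * k for k in range(60)]
original_errors, modified_errors = track_beliefs(capability, b1=2.0)
print(original_errors[-1], modified_errors[-1])
```

In this example the error of the modified rule shrinks by the factor $1 - 1/b_1$ at every step, as in (25), whereas the original rule keeps a constant lag behind the drifting capability.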


5  Results 


5.1  Analytical  Expression  and  Conditions  for  the  SR-like  Behavior  in  the  SPRT 
Model 


Based on the suggestion of Dr. Jun Zhang, the AFOSR Program Manager, we launched and completed a rather exhaustive review of the past literature on SPRT and SR-related work. In particular, we focused our attention on the papers published in the mainstream journal "Annals of Mathematical Statistics" from 1945 to 2007, where the pioneering work of Wald and his co-workers (1945, 1948) was published. We also reviewed past literature on SR-like behaviors in psychological experiments, as well as some textbooks or edited books on SPRT, such as those entitled Detection Theory, Handbook of Sequential Analysis, Response Time, and Optimum Stopping Rules. Finally, we also searched for previous work on first passage probability and recursive alternating truncation and convolution. After all these efforts, it is our conclusion that the expression we derived for the SPRT/DDD model, as reported in Subsection 4.2.3, is new. The calculations involve alternating multiplication of functions in the time domain and the frequency domain and, in our view, cannot be further simplified.

On the other hand, we have used the continuous-time drift diffusion (CDD, for short) model to find the necessary and sufficient condition for the SR-like behavior in CR. It is shown that the following condition is necessary and sufficient for the SR-like behavior in terms of the metric CR:


(2^0  -  Ao  -  Ai)(Ai  -  Ao)  (2^1  -  Ao  -  Ai)(Ai  -  Ao)  ^  r, 

2d^  ""  2d^ 


(26) 


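It is easy to see that condition (26) reduces to the following simplified form when the actual and assumed noise means are both zero, $\mu_0 = \tilde{\mu}_0 = 0$, and the assumed signal-plus-noise mean is positive, $\tilde{\mu}_1 > 0$:

\[ \mu_1 < \tfrac{1}{2}\tilde{\mu}_1. \]
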

In addition, simulation results have confirmed our theoretical findings. In the following MATLAB simulation (Figure 7), the SR-like behavior clearly occurs in the fraction of correct responses when the actual signal strength is much weaker than anticipated. The vertical axis is the fraction of correct responses for various thresholds $Z$, while the horizontal axis is $\sigma$, the standard deviation of the noise.

From the CDD model, it is also proven that the height of the peak is independent of $Z$, and the position of the peak (i.e., the optimal noise strength) scales with $\sqrt{Z}$ and $\sigma$. As a result, for large $Z$ or $\sigma$, the position of the peak corresponds to strong noise, and, as expected, extra noise is helpful under the proposed condition. We have also shown that to guarantee a fixed CR, the response time must increase linearly with the variance of the noise.

Figure 7: Fraction of Correct Responses vs. Noise. (SR case)

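The qualitative behavior of the peak in CR can be illustrated with a small Python sketch. It is not the report's derivation. It assumes a particular setup: the decision variable is the log-likelihood ratio of $N(\mu_1^{\text{assumed}}, \sigma_{\text{assumed}}^2)$ against $N(0, \sigma_{\text{assumed}}^2)$, the inputs actually have mean 0 or a smaller mean with standard deviation $\sigma$, the thresholds sit at $\pm Z$, and the two hypotheses are equally likely. The only formula used is the textbook probability that a drift-diffusion process reaches the upper of two symmetric barriers first. All names and parameter values are hypothetical.

```python
import math


def cdd_correct_rate(sigma, z, mu1_actual, mu1_assumed, sigma_assumed):
    """Fraction of correct responses for a drift-diffusion with barriers at +/- z."""
    gain = mu1_assumed / sigma_assumed**2   # slope of the assumed log-likelihood ratio
    diffusion = (gain * sigma) ** 2         # variance of the accumulated increment

    def prob_upper(mean_input):
        drift = gain * (mean_input - mu1_assumed / 2.0)
        # P(hit +z before -z) = 1 / (1 + exp(-2 * drift * z / diffusion)),
        # written with tanh so that a very small sigma does not overflow.
        return 0.5 * (1.0 + math.tanh(drift * z / diffusion))

    correct_h0 = 1.0 - prob_upper(0.0)      # noise-only inputs should end at -z
    correct_h1 = prob_upper(mu1_actual)     # signal inputs should end at +z
    return 0.5 * (correct_h0 + correct_h1)  # equal priors assumed


def peak(z, sigmas):
    """Return (sigma at the maximum, maximum CR) over a grid of noise levels."""
    rates = [cdd_correct_rate(s, z, 0.25, 1.0, 1.0) for s in sigmas]
    best = max(range(len(sigmas)), key=rates.__getitem__)
    return sigmas[best], rates[best]


sigmas = [0.05 * k for k in range(1, 401)]
for z in (2.0, 8.0):
    print(z, peak(z, sigmas))
```

Under these assumptions the maximum CR is the same for both thresholds, and quadrupling $Z$ roughly doubles the noise level at which the maximum occurs, in line with the $\sqrt{Z}$ scaling stated above.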

5.2  Differences  Between  the  Predictions  of  the  CDD  and  DDD  Models 

Through  a  comparative  study  supported  by  mathematical  analysis  and  computer  simulations  (see  Figures 
8-10),  three  major  differences  between  the  predictions  of  the  CDD  model  and  the  DDD  model  are  observed 
as  follows. 

•  First, if the SPRT/DDD model selects the thresholds according to the equations from the CDD model, then CR (the mathematical expectation of accuracy) would be better than desired, but at the price of a longer response time than anticipated by the CDD model.

•  Second,  the  response  time  always  exhibits  a  peak  which  is  not  predicted  by  the  CDD  model.  As  a 
result,  if  the  actual  noise  strength  is  to  the  right  of  the  peak,  then  a  moderate  amount  of  extra  noise 
could  possibly  reduce  the  response  time  without  violating  the  constraints  on  the  Type  I  and  Type  II 
errors,  due  to  the  first  difference. 

•  Third,  the  tail  of  the  reward  rate  curve  drops  significantly  with  stronger  noise,  while  the  CDD  model 
predicts  constant  reward  rate.  Such  a  result  may  be  important  for  psychological  experiments. 
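The first difference listed above can be illustrated with a small Monte Carlo sketch. It assumes a matched Gaussian model ($\mu_0 = 0$, $\mu_1 = 1$, $\sigma = 1$), symmetric thresholds computed from the continuous approximation for a desired accuracy of 0.95, and alternating $H_0$ and $H_1$ trials. The function names, seed, and trial count are illustrative choices, not values from the report.

```python
import math
import random


def run_sprt(mean, sigma, mu1_model, threshold, rng):
    """One discrete-time SPRT for N(0, sigma^2) versus N(mu1_model, sigma^2).

    Returns (decision, n_samples); decision is 1 for H1 and 0 for H0.
    """
    log_lr, n_samples = 0.0, 0
    while abs(log_lr) < threshold:
        x = rng.gauss(mean, sigma)
        log_lr += (mu1_model / sigma**2) * (x - mu1_model / 2.0)
        n_samples += 1
    return (1 if log_lr > 0 else 0), n_samples


def estimate_performance(threshold, n_trials=4000, seed=1):
    """Estimate accuracy and mean number of samples over alternating trials."""
    rng = random.Random(seed)
    correct, total_samples = 0, 0
    for trial in range(n_trials):
        truth = trial % 2  # alternate H0 and H1
        decision, n = run_sprt(float(truth), 1.0, 1.0, threshold, rng)
        correct += int(decision == truth)
        total_samples += n
    return correct / n_trials, total_samples / n_trials


desired = 0.95
# Threshold for which the continuous model gives exactly the desired accuracy.
threshold = math.log(desired / (1.0 - desired))
# Mean decision time of the continuous model for drift 1/2 and unit diffusion.
cdd_time = (threshold / 0.5) * math.tanh(0.5 * threshold)
accuracy, mean_samples = estimate_performance(threshold)
print(accuracy, mean_samples, cdd_time)
```

Because the discrete log-likelihood ratio overshoots the threshold at the moment of crossing, the simulated accuracy comes out above the desired value and the mean number of samples above the continuous prediction.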

5.3  Nonzero  Initial  Conditions  and  Extensions  to  Other  Forms  of  Distributions 

For the purpose of testing the robustness of the SR-like behavior, we have plotted the CR when the initial diffusion position is biased (because of asymmetric prior probability or asymmetric constraints on Type I and Type II errors). The simulation results (see Figure 11 for example) show that the SR-like behavior becomes more evident as the initial bias moves toward the upper threshold and weaker if the initial bias is negative. Numerical studies are also run on other forms of distributions such as the Gamma distribution; see Figure 12. For the Gamma distribution with wrongly assumed shape parameter $K$, it is observed that SR-like behavior also occurs. Unlike the Gaussian case, the peak gets higher and narrower with larger $Z$, but the position of the peak does not seem to move much with $Z$. Like the Gaussian case, the response time also exhibits a peak and gets longer almost linearly with increasing $Z$, and the reward rate also exhibits a peak and gets smaller with increasing $Z$.

Figure 8: Response Time vs. Noise. (Optimal case)

Figure 9: Response Time vs. Noise. (SR case)

Figure 10: Reward Rate vs. Noise. (Optimal case and SR case)

Figure 11: Fraction of Correct Responses vs. Noise with Non-zero Initial Condition. (SR case)
5.4  Role  of  Prior  Distribution  in  Adaptive  SPRT 

Our  research  demonstrates  performance  improvement  in  terms  of  accuracy  (while  the  response  time  remains 
moderate)  when  using  the  adaptive  SPRT  algorithm  with  prior  distribution.  See  Figures  13  and  14. 

For these simulations, we assume that the distributions of the input data remain constant during the trials. As a result, we will use all the input sequences that were drawn from $H_0$ to adapt $p(x|H_0)$. We also apply equal weights to all the input sequences. This assumption can be readily relaxed. For example, if we assume a changing environment, we can apply exponentially decreasing weights to the sequences.

To be more specific, the first decision is based solely on the prior distributions because we do not know which distribution should be adapted. Suppose that, after we receive the feedback, we know that the input sequence in the first trial was drawn from $H_0$; we then adapt the parameters of $p(x|H_0)$ using that input sequence (combined with the prior distribution) and apply the new parameters in the next trial. Suppose next that the feedback on the second decision shows that its input sequence was drawn from $H_1$; we then use this input sequence to adapt the parameters of $p(x|H_1)$. Suppose that after 5 decisions we know that trials 1, 3, 5 were drawn from $H_0$; we then estimate $p(x|H_0)$ using input sequences 1, 3, 5 (combined with the prior distribution) and estimate $p(x|H_1)$ using input sequences 2, 4 (combined with the prior distribution).
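The trial-by-trial procedure can be sketched as follows for the simple case in which the two means are known, the two hypotheses share one variance, and only that variance is adapted with rule (20). The class name, its interface, and the numbers in the usage lines are hypothetical; equal weights are applied to all stored samples, as in the simulations described here.

```python
class SharedSigmaAdapter:
    """Adapts a common variance from labelled trials and a prior (illustrative)."""

    def __init__(self, known_means, prior_variance, m_virtual):
        self.known_means = dict(known_means)  # {0: mu_0, 1: mu_1}, assumed known
        self.prior_variance = prior_variance  # sigma_p^2
        self.m = m_virtual                    # M "virtual inputs"
        self.sum_sq = 0.0                     # squared residuals over all trials
        self.n = 0                            # N real inputs seen so far

    def record_trial(self, sequence, true_hypothesis):
        """Store a finished trial once feedback reveals which hypothesis was true."""
        mu = self.known_means[true_hypothesis]
        self.sum_sq += sum((x - mu) ** 2 for x in sequence)
        self.n += len(sequence)

    def variance(self):
        """Variance to use in the next trial: (M*var_p + N*var_d) / (M + N)."""
        if self.n == 0:
            return self.prior_variance        # first trial: prior only
        data_variance = self.sum_sq / self.n
        return (self.m * self.prior_variance + self.n * data_variance) / (self.m + self.n)


adapter = SharedSigmaAdapter({0: 0.0, 1: 6.0}, prior_variance=400.0, m_virtual=10)
first_trial_variance = adapter.variance()         # prior only
adapter.record_trial([1.0, -1.0], true_hypothesis=0)
adapter.record_trial([7.0, 5.0], true_hypothesis=1)
print(first_trial_variance, adapter.variance())
```

The feedback label is needed only to pick the mean that is subtracted from each sample; once the residuals are formed, sequences from both hypotheses contribute to the same variance estimate.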

For comparison, we still assume that $\sigma_0 = \sigma_1 = \sigma$ and let the adaptive process update $\sigma$, i.e., update $p(x|H_0)$ and $p(x|H_1)$ simultaneously. The formula for the adapting process is equation (20). The parameters are also chosen to be the same as in the previous section.

Figure 12: Fraction of Correct Responses vs. scaling parameter $\theta$. Gamma distribution with parameters: $\theta = 3$, $K_0 = 2$, $K_1 = 6$, $K_0 = 2$, $K_1 = 3$. (SR case)

Figure 13: CR of the first 5 steps versus sigma, M = 10, power = 0.95, significance = 0.05. $\mu = 6$, $\sigma_p = 20$, $\sigma \in \{10, 12, \ldots, 30\}$. Averaged over 2000 trials.

Figure 14: DT of the first 5 steps versus sigma, M = 10, power = 0.95, significance = 0.05. $\mu = 6$, $\sigma_p = 20$, $\sigma \in \{10, 12, \ldots, 30\}$. Averaged over 2000 trials.

In Figures 13 and 14, we plot the CR and DT for each trial to see how fast the algorithm converges to the real distributions and reaches the desired optimality. The results are also plotted with respect to mismatched parameters such as $\mu_1$ and $\sigma$ to check robustness. The results show that CR converges to the desired accuracy while DT remains reasonable and converges to the performance of the optimal SPRT. The first trial does not adapt at all; it only benefits the following trials. As a result, its performance is the same as that of the original non-adaptive SPRT. The performance of the adaptive SPRT improves markedly from the second trial onward, because the first trial of the SPRT collects a lot of information about the real distributions.


5.5  Extended  Decision  Field  Theory  for  Dynamic  Decision  Making 

Our research applies a control-theoretic approach to learning in dynamic decision making tasks to the study of the Sugar Factory task. By constructing a control model, it presents a fairly good estimate of automation control capability to participants. The model also provides an accurate approximation and a reliable reference to participants through simulation. Aiming at enhancing appropriate trust in automation in a supervisory control system, a modified version of the previous EDFT model is proposed to provide a more accurate approximation of trust. Feasibility is demonstrated by both theoretical analysis and simulation of a Sugar Factory supervisory control system. After modification, the model becomes robust to disturbance irrespective of the fluctuations, and its effectiveness is demonstrated. See Figures 15-17 for the implementation of our generalized DFT on the benchmark example of the SF model.


Figure 15: Comparison of $B_{CA1}$ and $B_{CA2}$ when no disturbance exists ($h = 2$)

Figure 16: Comparison of $B_{CA1}$ and $B_{CA2}$ when no disturbance exists ($h = 100$)

Figure 17: Difference between $B_{CAi}$ ($i = 1, 2$) and $C_A$ in the presence of disturbance ($h = 2$)


6 Estimates of Technical Feasibility

Our phase I research reveals theoretically that stochastic-resonance-like behavior arises in the SPRT model when the actual input signal is significantly weaker than anticipated by the model. Theoretical analysis and computational results demonstrate that both the fraction of correct responses and the reward rate have a peak as a function of noise strength under the SR condition, while the mean decision time is a generally decreasing function of the noise strength with the exception of an insignificant peak. The conclusion that an appropriate amount of noise can improve the decision making process (when the input signal is significantly weaker than anticipated) is useful and may explain human and animal decision making in psychophysical tasks. The novel expressions and conditions for the SR-like behavior in the SPRT model lay a solid foundation for experimental validation of our prediction.

We have proposed an extension of Busemeyer's Decision Field Theory to handle supervisory control situations where manual control and automation coexist. By means of the celebrated contraction mapping principle, we have developed an improved model describing reliance on automation (trust versus self-confidence), which we hope will help the operator make appropriate use of automation and manual control.


7 Summary and Future Work

We studied the SPRT/DDD model for decision making under the condition that the input signal is much weaker than anticipated by the model. We derived expressions for calculating the fraction of correct responses, the mean decision time, and the reward rate for the model. Under the assumption that the noise and signal-plus-noise inputs follow Gaussian distributions of equal variance, we found that the fraction of correct responses and the reward rate show SR-like behavior when the actual mean signal level is less than the average of the noise mean and signal-plus-noise mean used in the calculation of the likelihood ratio. Such a discrepancy between the actual input distributions and those used by the decision-making process is likely to occur because the brain does not have direct access to the actual input distributions in a psychophysical experiment or in a real-world situation but has to rely on prior experience over a long period of time to estimate the distributions. The SR condition can be understood analytically by using the closed-form expressions from the CDD model. There are also important differences between the SPRT/DDD model and its continuous counterpart, the CDD model. For example, if the thresholds are chosen according to the CDD equations, then the accuracy is always better than expected while the response time is always longer. The response time also has a peak, which can be exploited to reduce the response time without violating the constraints on Type I and Type II errors. The reward rate also appears lower than expected and gets even lower with stronger noise. Finally, to establish the robustness of our findings, we showed that a similar pattern of results is obtained when the initial position of the diffusion process is varied and when the inputs are drawn from gamma, instead of Gaussian, probability distributions. Based on the study, we conclude that a moderate amount of noise can improve decision making when the input signal is weak. This prediction can be tested psychophysically with the reaction-time paradigm. We also studied the role of prior distribution in adaptive SPRT and analyzed its effects on accuracy and reward rate.

Topics for future research include the systematic study of the signal-dependent noise case and the generalization of our preliminary results on adaptive SPRT to other forms of distributions.